E4 — Machine Learning
Abstract
Machine learning’s focus on ill-defined problems and highly flexible methods makes it ideally suited for KDD applications. Among the ideas machine learning contributes to KDD are the importance of empirical validation, the impossibility of learning without a priori assumptions, and the utility of limited-search or limited-representation methods. Machine learning provides methods for incorporating knowledge into the learning process, changing and combining representations, combatting the curse of dimensionality, and learning comprehensible models. KDD challenges for machine learning include scaling up its algorithms to large databases, using cost information in learning, automating data pre-processing, and enabling rapid development of applications. KDD opens up new directions for machine learning research, and brings new urgency to others. These directions include interfacing with the human user and the database system, learning from non-attribute-vector data, learning partial models, and learning continuously from an open-ended stream of data.

E4.1 Use of machine learning methods for KDD

Machine learning is characterized by a focus on complex representations, ill-defined problems, and search-based methods. Representations studied include most of those described in Section B2, but particularly decision trees, sets of propositional or first-order rules, sets of instances, clusters, concept hierarchies and probabilistic networks. Ill-defined problems studied include generalizing from a set of tuples in the absence of a known model structure [C5.1], clustering [C5.5], combining logic theories of a domain with learning (de Raedt, 1996), and learning from delayed feedback in very large decision spaces (Sutton & Barto, 1998). Search methods [B8] used for learning include greedy search, gradient descent, expectation maximization, genetic algorithms, and some forms of lookahead and pruned breadth-first search.
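As one illustration of the search-based methods listed above, the following minimal sketch applies gradient descent to a tiny least-squares problem. The function name, learning rate, step count, and toy data are assumptions chosen for illustration and do not come from the source text.

```python
def fit_line_gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y ~ w*x + b by gradient descent on mean squared error.

    All names and default values here are hypothetical, for illustration only.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient to reduce the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (noise-free, assumed for the example).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line_gradient_descent(xs, ys)
print(round(w, 3), round(b, 3))
```

The search here moves through a continuous parameter space by following the local slope of the error, which is why it needs no enumeration of candidate models. The learning rate is kept small enough for this particular data set to converge; other data would likely need a different value.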
Other types of search frequently found in artificial intelligence, like best-first search and simulated annealing, tend to see less use in machine learning, for reasons discussed below.

The flexibility of most machine learning methods makes them well suited to applications where little is known a priori about the domain, and/or relevant knowledge is hard to elicit. This flexibility also means they are often able to successfully learn from data that was not gathered by a purposely designed experimental procedure, but rather obtained by some process whose end goal was not necessarily knowledge discovery. The flip side of this is that theoretical analysis of machine learning methods is often difficult, and strong guarantees regarding the correctness of results are
Publication date: 2007